RBM-SMOTE: Restricted Boltzmann Machines for Synthetic Minority Oversampling Technique

نویسندگان

  • Maciej Zieba
  • Jakub M. Tomczak
  • Adam Gonczarek
چکیده

The problem of imbalanced data, i.e., when the class labels are unequally distributed, is encountered in many real-life application, e.g., credit scoring, medical diagnostics. Various approaches aimed at dealing with the imbalanced data have been proposed. One of the most well known data pre-processing method is the Synthetic Minority Oversampling Technique (SMOTE). However, SMOTE may generate examples which are artificial in the sense that they are impossible to be drawn from the true distribution. Therefore, in this paper, we propose to apply Restricted Boltzmann Machine to learn an intermediate representation which transform the SMOTE examples to the ones approximately drawn from the true distribution. At the end of the paper we perform an experiment using credit scoring dataset.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning

In recent years, mining with imbalanced data sets receives more and more attentions in both theoretical and practical aspects. This paper introduces the importance of imbalanced data sets and their broad application domains in data mining, and then summarizes the evaluation metrics and the existing methods to evaluate and solve the imbalance problem. Synthetic minority oversampling technique (S...

متن کامل

Geometric SMOTE: Effective oversampling for imbalanced learning through a geometric extension of SMOTE

Classification of imbalanced datasets is a challenging task for standard algorithms. Although many methods exist to address this problem in different ways, generating artificial data for the minority class is a more general approach compared to algorithmic modifications. SMOTE algorithm and its variations generate synthetic samples along a line segment that joins minority class instances. In th...

متن کامل

A Synthetic Minority Oversampling Method Based on Local Densities in Low-Dimensional Space for Imbalanced Learning

Imbalanced class distribution is a challenging problem in many real-life classification problems. Existing synthetic oversampling do suffer from the curse of dimensionality because they rely heavily on Euclidean distance. This paper proposed a new method, called Minority Oversampling Technique based on Local Densities in Low-Dimensional Space (or MOT2LD in short). MOT2LD first maps each trainin...

متن کامل

Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem

The class imbalanced problem occurs in various disciplines when one of target classes has a tiny number of instances comparing to other classes. A typical classifier normally ignores or neglects to detect a minority class due to the small number of class instances. SMOTE is one of over-sampling techniques that remedies this situation. It generates minority instances within the overlapping regio...

متن کامل

Visually Debugging Restricted Boltzmann Machine Training with a 3D Example

Restricted Boltzmann Machines (RBMs) are being applied to a growing number of problems with great success. In the process of training an RBM one must pick a number of parameters, but often these parameters are brittle and produce poor performance when slightly off. Here we describe several useful visualizations to assist in choosing appropriate values for these parameters. We also demonstrate a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015